
73 research outputs found

LambdaLoss: Metric-Driven Loss for Learning-to-Rank

Author: Bendersky Michael
Golbandi Nadav
Li Cheng
Najork Marc
Wang Xuanhui
Publication venue: Technical Disclosure Commons
Publication date: 31/05/2018

How to directly optimize ranking metrics such as Normalized Discounted Cumulative Gain (NDCG) is an interesting but challenging problem, because ranking metrics are either flat or discontinuous everywhere. Among existing approaches, LambdaRank is a novel algorithm that incorporates metrics into its learning procedure. Though empirically effective, it still lacks theoretical justification. For example, what is the underlying loss that LambdaRank optimizes for? Due to this, it is unclear whether LambdaRank will always converge. In this paper, we present a well-defined loss for LambdaRank in a probabilistic framework and show that LambdaRank is a special configuration in our framework. This framework, which we call LambdaLoss, provides theoretical justification for LambdaRank. Furthermore, we propose a few more metric-driven loss functions in our LambdaLoss framework. Our loss functions have a clear connection to ranking metrics and can be optimized in our framework efficiently. Experiments on three publicly available data sets show that our methods significantly outperform the state-of-the-art learning-to-rank algorithms. This confirms both the theoretical soundness and the practical effectiveness of the LambdaLoss framework.
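The claim that ranking metrics are "either flat or discontinuous everywhere" can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names `dcg` and `ndcg`, the exponential gain `2^rel - 1`, the `log2` discount, and the toy scores and labels are all assumptions chosen for illustration.

```python
import math

def dcg(labels):
    """Discounted cumulative gain of graded labels given in ranked order.

    Assumes the common gain 2^rel - 1 and discount log2(rank + 1), rank from 1.
    """
    return sum((2 ** rel - 1) / math.log2(rank + 2) for rank, rel in enumerate(labels))

def ndcg(scores, labels):
    """NDCG of the ranking obtained by sorting documents by descending score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranked = [labels[i] for i in order]
    ideal = dcg(sorted(labels, reverse=True))
    return dcg(ranked) / ideal if ideal > 0 else 0.0

labels = [2, 0, 1]  # hypothetical graded relevance of three documents

# Changing a score without changing the order leaves NDCG unchanged (flat) ...
ndcg_base = ndcg([0.9, 0.5, 0.40], labels)
ndcg_nudged = ndcg([0.9, 0.5, 0.49], labels)

# ... while a tiny further change that swaps two documents makes it jump.
ndcg_swapped = ndcg([0.9, 0.5, 0.51], labels)

print(ndcg_base, ndcg_nudged, ndcg_swapped)
```

Because the metric depends on the scores only through the induced order, its gradient with respect to the scores is zero wherever it exists, which is why a surrogate loss is needed.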

Technical Disclosure Commons

WIT: Wikipedia-based Image Text Dataset for Multimodal Multilingual Machine Learning

Author: Bendersky Michael
Chen Jiecao
Najork Marc
Raman Karthik
Srinivasan Krishna
Publication date: 03/03/2021

The milestone improvements brought about by deep representation learning and pre-training techniques have led to large performance gains across downstream NLP, IR and Vision tasks. Multimodal modeling techniques aim to leverage large high-quality visio-linguistic datasets for learning complementary information (across image and text modalities). In this paper, we introduce the Wikipedia-based Image Text (WIT) Dataset (https://github.com/google-research-datasets/wit) to better facilitate multimodal, multilingual learning. WIT is composed of a curated set of 37.6 million entity-rich image-text examples with 11.5 million unique images across 108 Wikipedia languages. Its size enables WIT to be used as a pretraining dataset for multimodal models, as we show when applied to downstream tasks such as image-text retrieval. WIT has four main and unique advantages. First, WIT is the largest multimodal dataset by the number of image-text examples by 3x (at the time of writing). Second, WIT is massively multilingual (first of its kind) with coverage over 100+ languages (each of which has at least 12K examples) and provides cross-lingual texts for many images. Third, WIT represents a more diverse set of concepts and real-world entities relative to what previous datasets cover. Lastly, WIT provides a very challenging real-world test set, as we empirically illustrate using an image-text retrieval task as an example.

arXiv.org e-Print Archive

TRAINING A RANKING MODEL

Author: Bendersky Michael
Metzler Donald Arthur, Jr
Najork Marc Alexander
Wang Xuanhui
Publication venue: Technical Disclosure Commons
Publication date: 15/04/2021

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a ranking machine learning model. In one aspect, a method includes the actions of receiving training data for a ranking machine learning model, the training data including training examples, and each training example including data identifying: a search query, result documents from a result list for the search query, and a result document that was selected by a user from the result list; receiving position data for each training example in the training data, the position data identifying a respective position of the selected result document in the result list for the search query in the training example; determining, for each training example in the training data, a respective selection bias value; and determining a respective importance value for each training example from the selection bias value for the training example, the importance value
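The steps listed in this abstract (training examples with a clicked result, its position, a selection bias value, and an importance value) can be sketched in Python as follows. This is only an illustration under stated assumptions: the abstract is truncated and does not say how the importance value is computed, so the sketch assumes an inverse-propensity style weight, `1 / bias`. The `TrainingExample` class, the `SELECTION_BIAS` table, and its numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class TrainingExample:
    query: str            # the search query
    results: List[str]    # result documents from the result list
    selected: str         # the result document the user selected
    position: int         # 1-based position of the selected document

# Hypothetical selection bias per position: an assumed probability that a
# result shown at that position gets examined by the user at all.
SELECTION_BIAS: Dict[int, float] = {1: 1.0, 2: 0.5, 3: 0.25}

def importance_value(example: TrainingExample,
                     bias: Dict[int, float] = SELECTION_BIAS) -> float:
    """Assumed rule: weight each example by the inverse of its selection bias."""
    return 1.0 / bias[example.position]

examples = [
    TrainingExample("python sort list", ["d1", "d2", "d3"], "d1", 1),
    TrainingExample("python sort list", ["d1", "d2", "d3"], "d3", 3),
]

# Selections made far down the list are rarer, so they receive larger weights.
weights = [importance_value(e) for e in examples]
print(weights)
```

Under this assumption, a training loop would multiply each example's loss by its weight so that selections at low-visibility positions are not under-represented.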

Technical Disclosure Commons

Exploring the Viability of Synthetic Query Generation for Relevance Prediction

Author: Bendersky Mike
Chaudhary Aditi
Hashimoto Kazuma
Najork Marc
Raman Karthik
Srinivasan Krishna
Publication date: 16/06/2023

Query-document relevance prediction is a critical problem in Information Retrieval systems. This problem has increasingly been tackled using (pretrained) transformer-based models which are finetuned using large collections of labeled data. However, in specialized domains such as e-commerce and healthcare, the viability of this approach is limited by the dearth of large in-domain data. To address this paucity, recent methods leverage these powerful models to generate high-quality task and domain-specific synthetic data. Prior work has largely explored synthetic data generation or query generation (QGen) for Question-Answering (QA) and binary (yes/no) relevance prediction, where for instance, the QGen models are given a document, and trained to generate a query relevant to that document. However, in many problems, we have a more fine-grained notion of relevance than a simple yes/no label. Thus, in this work, we conduct a detailed study into how QGen approaches can be leveraged for nuanced relevance prediction. We demonstrate that -- contrary to claims from prior works -- current QGen approaches fall short of the more conventional cross-domain transfer-learning approaches. Via empirical studies spanning 3 public e-commerce benchmarks, we identify new shortcomings of existing QGen approaches -- including their inability to distinguish between different grades of relevance. To address this, we introduce label-conditioned QGen models which incorporate knowledge about the different relevance grades. While our experiments demonstrate that these modifications help improve performance of QGen techniques, we also find that QGen approaches struggle to capture the full nuance of the relevance label space and as a result the generated queries are not faithful to the desired relevance label. Comment: In Proceedings of ACM SIGIR Workshop on eCommerce (SIGIR eCom 23)
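The difference between plain and label-conditioned query generation described in this abstract can be pictured as a difference in how training pairs are built. The Python sketch below is a guess at the general idea, not the authors' format: the function `make_qgen_pair`, the `relevance:` and `document:` prefixes, and the example texts are all hypothetical.

```python
from typing import Optional, Tuple

def make_qgen_pair(document: str, query: str,
                   label: Optional[str] = None) -> Tuple[str, str]:
    """Build an (input, target) text pair for a sequence-to-sequence generator.

    With label=None the model only sees the document (plain QGen).
    With a label, the desired relevance grade is prepended to the input
    (an assumed way of conditioning on the label).
    """
    prefix = f"relevance: {label} " if label is not None else ""
    return (f"{prefix}document: {document}", query)

doc = "Stainless steel water bottle, 750 ml, keeps drinks cold for 24 hours."

# Plain QGen: the document maps to a query assumed to be relevant to it.
plain = make_qgen_pair(doc, "insulated water bottle")

# Label-conditioned QGen: the same document paired with queries of
# different (hypothetical) relevance grades.
exact = make_qgen_pair(doc, "750 ml steel water bottle", label="exact")
partial = make_qgen_pair(doc, "plastic water bottle", label="substitute")

print(plain[0])
print(exact[0])
```

At generation time, such a model would be prompted with a document and a target grade, and the generated query would then be paired with that grade as synthetic training data for the relevance predictor.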

arXiv.org e-Print Archive